INTRODUCTION AND MOTIVATION


Our project uses Canada’s historical precipitation
data to analyze patterns, predict future trends,
and identify regions with surplus water that could
help mitigate drought and replenish the water
supply in drought-affected areas, using machine
learning classification models.


ALGORITHMS 


• Prediction: Decision Tree, Logistic Regression

• Allocation: Euclidean distance, Kuhn-Munkres

• Visualizations: D3.js (JavaScript), Folium (Leaflet),
GeoPandas


DATA COLLECTION 


• Source: Canada’s Historical Weather Data
Archive: https://climate.weather.gc.ca/

• Data is available per station per day, in
separate CSV files

• The files are scraped and consolidated
with Python scripts

• Total observations: 964,638

• Total stations: 1370

• Total time period: 2 years (2022, 2023)

• Total features/columns: 19
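
The consolidation step described in the list above could look roughly like the following. This is a minimal sketch, not the project's actual script: the function name `consolidate`, the directory layout, and the assumption that every per-station file shares the header of the first file are all hypothetical.

```python
import csv
from pathlib import Path


def consolidate(csv_dir, out_path):
    """Merge every per-station CSV in csv_dir into one CSV at out_path.

    Assumption: all input files share the header row of the first file.
    Returns the number of data rows written.
    """
    out_path = Path(out_path)
    # Collect inputs first so the output file is never read as an input.
    paths = [p for p in sorted(Path(csv_dir).glob("*.csv"))
             if p.resolve() != out_path.resolve()]
    rows_written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = None
        for path in paths:
            with open(path, newline="", encoding="utf-8") as f:
                reader = csv.DictReader(f)
                if writer is None:
                    # The first file's header defines the output columns.
                    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                                            extrasaction="ignore")
                    writer.writeheader()
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Streaming row by row keeps memory use flat, which matters when the combined file approaches a million observations.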


EXPERIMENTS AND RESULTS 


Decision Tree 
F1 Score: 0.919368 
Time to Train: 53.365 


Logistic Regression 
F1 Score: 0.8098157 
Time to Train: 75.45 


The Decision Tree's higher F1 score and shorter
training time suggest it is the more robust model.
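
For reference, an F1 score for a three-class problem can be computed from true and predicted labels as sketched below. The poster does not state which averaging was used, so the macro average here is an assumption, and the function names are illustrative only.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1} using F1 = 2*TP / (2*TP + FP + FN) per class."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    scores = {}
    for c in labels:
        tp = pairs[(c, c)]
        fp = sum(pairs[(t, c)] for t in labels if t != c)
        fn = sum(pairs[(c, p)] for p in labels if p != c)
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)
```

Macro averaging weights 'drought', 'surplus', and 'stable' equally regardless of how many observations fall in each class; a weighted average would give a different number on imbalanced data.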


[Figure: Confusion Matrix for Multilevel Decision Tree (x-axis: Predicted labels)]


APPROACH


Our approach is to train ML classification models on a labelled dataset. The steps are:


• Collect 2 years of precipitation data from Canada’s historical weather archive

• Label data as ‘surplus’, ‘drought’, or ‘stable’ based on a 60-day threshold

• Train 2 classification models with an 80/20 train/test split

• Evaluate performance using confusion matrices and F1 scores

• Create an interactive visual UI that uses the best-performing model to identify areas of drought
and surplus to help with logistical planning of water distribution.
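
The labelling and splitting steps above might be sketched as follows. The poster only says labels are based on a 60-day threshold, so the rolling 60-day total, the cutoffs `low_mm` and `high_mm`, and both function names are hypothetical choices made for illustration.

```python
import random


def label_windows(daily_mm, window=60, low_mm=30.0, high_mm=180.0):
    """Label each day from the precipitation total of the trailing window.

    Days without a full window of history get None. The cutoffs are
    placeholder values, not the thresholds used in the project.
    """
    labels = []
    running = 0.0
    for i, mm in enumerate(daily_mm):
        running += mm
        if i >= window:
            running -= daily_mm[i - window]  # drop the day leaving the window
        if i + 1 < window:
            labels.append(None)
        elif running < low_mm:
            labels.append("drought")
        elif running > high_mm:
            labels.append("surplus")
        else:
            labels.append("stable")
    return labels


def train_test_split_80_20(rows, seed=0):
    """Shuffle rows reproducibly and return (train, test) in an 80/20 split."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]
```

A random split is the simplest reading of "80/20"; for time-ordered weather data, splitting by date would avoid leaking neighbouring days between the training and test sets.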


Our intended innovation is a more integrated approach that combines
historical precipitation levels, predicted future precipitation levels, and
prescriptive strategies for water management and redistribution.


EXPLORATORY DATA VISUALIZATION 


[Figure: Average Precipitation by Month, Grouped by Province]
[Figure: Histogram of Average Precipitation by Station]


VISUALIZATION AND UI


The visualization UI lets the user select a province, a minimum temperature, a maximum
temperature, and a month of the year. From these inputs it shows: 1) a heatmap of the precipitation
levels, 2) the closest stations with the highest and lowest levels of precipitation, and 3) the overall
predicted precipitation for the chosen parameters.
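
One plausible reading of the station lookup is sketched below: find the driest and wettest stations, then list the stations nearest to a reference station by Euclidean distance. The record layout (`name`, `xy`, `precip_mm`) and both function names are assumptions, and the UI's actual ranking rule may differ.

```python
import heapq
import math


def extremes(stations):
    """Return (driest, wettest) station records by precipitation."""
    driest = min(stations, key=lambda s: s["precip_mm"])
    wettest = max(stations, key=lambda s: s["precip_mm"])
    return driest, wettest


def closest_stations(reference, stations, k=5):
    """Return the k stations nearest to reference, nearest first.

    Distance is plain Euclidean distance between the 'xy' coordinates,
    which are assumed to be in a projected (planar) coordinate system.
    """
    others = [s for s in stations if s["name"] != reference["name"]]
    return heapq.nsmallest(
        k, others, key=lambda s: math.dist(reference["xy"], s["xy"]))
```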


[Screenshot: Canada Precipitation Predictor]


Select Province: British Columbia
Min Temperature (°C): -30
Max Temperature (°C): 40
Month: February
[Predict]


Station with the lowest precipitation: SPARWOOD CS with 0.1 mm


5 closest stations with the highest precipitation:

• SPARWOOD A (Distance: 0.08 km, Precipitation: 57.2 mm)

• FERNIE (Distance: 0.32 km, Precipitation: 96 mm)

• FT STEELE DANDY CRK (Distance: 0.62 km, Precipitation: 33.6 mm)

• CRANBROOK A (Distance: 0.91 km, Precipitation: 26.5 mm)

• CRANBROOK AIRPORT AUTO (Distance: 0.92 km, Precipitation: 25.9 mm)


Station with the highest precipitation: LENNARD ISLAND with 244.8 mm


5 closest stations with the lowest precipitation:

• SPARWOOD CS (Distance: 11.06 km, Precipitation: 0.1 mm)

• CAPE ST JAMES (Distance: 5.83 km, Precipitation: 0.1 mm)

• PORT MELLON (Distance: 2.46 km, Precipitation: 0.1 mm)

• PRINCETON A (Distance: 5.42 km, Precipitation: 0.1 mm)

• HOPE AIRPORT (Distance: 4.44 km, Precipitation: 0.1 mm)


Based on historical data, the predicted precipitation for British Columbia
with a minimum temperature of -30 and a maximum temperature of 40 in
the month of February is:


7.14 mm


Using the stations with the highest and lowest precipitation amounts, we created a node-based 
representation of how water distribution can be accomplished, by pairing drought and surplus nodes 
in a way that minimizes the total Euclidean distance between station pairs. 

[Key: Blue = Surplus, Red = Drought] 
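
The pairing step can be illustrated with a small pure-Python version of the Kuhn-Munkres (Hungarian) algorithm applied to a Euclidean distance matrix. This is a sketch under stated assumptions rather than the project's code: the function names and the coordinate inputs are hypothetical, and the number of drought nodes must not exceed the number of surplus nodes.

```python
import math


def hungarian(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns (total_cost, assignment), where assignment[i] is the column
    matched to row i. Standard O(n^2 * m) potentials formulation.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0.0] * (n + 1)    # row potentials
    v = [0.0] * (m + 1)    # column potentials
    p = [0] * (m + 1)      # p[j]: 1-based row matched to column j (0 = free)
    way = [0] * (m + 1)    # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:              # flip matches along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [None] * n
    for j in range(1, m + 1):
        if p[j]:
            assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return total, assignment


def pair_nodes(drought_xy, surplus_xy):
    """Pair each drought node with a distinct surplus node so that the
    total Euclidean distance is minimal."""
    cost = [[math.dist(d, s) for s in surplus_xy] for d in drought_xy]
    return hungarian(cost)
```

Unlike a greedy nearest-neighbour match, this minimizes the sum over all pairs, so an individual pair may be longer than its nearest alternative when that lowers the overall total.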


Pair  Drought Node                             Avg Precip   Surplus Node                                Avg Precip   Distance (Miles)

1     CALMAR (Alberta)                         0.080179     NITINAT RIVER HATCHERY (British Columbia)   9.028478     419.468727
2     FORT LIARD A (Northwest Territories)     0.165002     LAC STE CROIX (Quebec)                      12.000000    558.629269
3     KILLAM AGDM (Alberta)                    0.155469     TAHSIS VILLAGE NORTH (British Columbia)     17.346833    1636.571568
4     LITTLE CHICAGO                           [illegible]  HALIFAX KOOTENAY (Nova Scotia)              10.087459    1979.096483
5     LOWER CARP LAKE (Northwest Territories)  0.088973     PORT RENFREW (British Columbia)             11-1117127   2333.360461
6     MOULD BAY CS (Northwest Territories)     0.177664     BOAT BLUFF (British Columbia)               44703418     1946.257852
7     MYRNAM LITE (Alberta)                    0.011787     ZEBALLOS MURAUDE CREEK (British Columbia)   8.845884     1855.099448
8     RADWAY AGCM (Alberta)                    0.141118     IBERVILLE (Quebec)                          15.000000    2486.435492
9     SASKATOON INTL A (Saskatchewan)          0.147781     PRINCE RUPERT MONT CIRC (British Columbia)  13.789217    1539.968635
10    ST. MARY RESERVOIR (Alberta)             0.121622     GODBOUT (Quebec)                            14.000000    2420.839874


